On predicting the popularity of newly emerging hashtags in Twitter
Abstract
Because of Twitter's popularity and the viral nature of information dissemination on the platform, predicting which Twitter topics will become popular in the near future is a task of considerable economic importance. Many Twitter topics are annotated by hashtags. In this article, we propose methods to predict the popularity of new hashtags on Twitter by formulating the problem as a classification task. We use five standard classification models (naïve Bayes, k-nearest neighbors, decision trees, support vector machines, and logistic regression) for prediction. The main challenge is identifying effective features for describing new hashtags. We extract 7 content features from the hashtag string and the collection of tweets containing the hashtag, and 11 contextual features from the social graph formed by users who have adopted the hashtag. We conducted experiments on a Twitter data set consisting of 31 million tweets from 2 million Singapore-based users. The experimental results show that the standard classifiers using the extracted features significantly outperform baseline methods that do not use these features. Among the five classifiers, the logistic regression model performs best in terms of the Micro-F1 measure. We also observe that contextual features are more effective than content features.
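The abstract ranks the classifiers by the Micro-F1 measure. The following is a minimal sketch of how micro-averaged F1 is commonly computed for single-label, multi-class predictions. It is not the authors' evaluation code. The function name `micro_f1`, the class names, and the example labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then compute F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction: true positive for that class
        else:
            fp[pred] += 1    # the predicted class gains a false positive
            fn[truth] += 1   # the true class gains a false negative
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    if total_tp == 0:
        return 0.0
    precision = total_tp / (total_tp + total_fp)
    recall = total_tp / (total_tp + total_fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical popularity classes for six new hashtags (made-up labels).
y_true = ["not_popular", "not_popular", "popular",
          "very_popular", "popular", "not_popular"]
y_pred = ["not_popular", "popular", "popular",
          "very_popular", "not_popular", "not_popular"]
score = micro_f1(y_true, y_pred)
print(round(score, 4))
```

When every hashtag receives exactly one predicted class, each error counts once as a false positive and once as a false negative, so micro-averaged precision, recall, and F1 all equal plain accuracy. In this made-up example, four of six predictions are correct.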
Similar resources
LMPP: A Large Margin Point Process Combining Reinforcement and Competition for Modeling Hashtag Popularity
Predicting the popularity dynamics of Twitter hashtags has a broad spectrum of applications. Existing works have primarily focused on modeling the popularity of individual tweets rather than the underlying hashtags. As a result, they fail to consider several realistic factors contributing to hashtag popularity. In this paper, we propose Large Margin Point Process (LMPP), a probabilistic framewo...
A Stratified Learning Approach for Predicting the Popularity of Twitter Idioms
Twitter Idioms are an important type of hashtag that spreads on Twitter. In this paper, we propose a classifier that can stratify Idioms from other kinds of hashtags with 86.93% accuracy and high precision and recall. We then learn regression models on the stratified samples (Idioms and non-Idioms) separately to predict the popularity of the Idioms. This stratification not ...
Suggesting Hashtags on Twitter
As micro-blogging sites, like Twitter, continue to grow in popularity, we are presented with the problem of how to effectively categorize and search for posts. Looking specifically at Twitter, we see that users may categorize their posts using hashtags, and any word or phrase may be used as the category. Attempting to search for tweets about Facebook, a user would need to try many different has...
Don't Let Me Be #Misunderstood: Linguistically Motivated Algorithm for Predicting the Popularity of Textual Memes
Prediction of the popularity of online textual snippets has gained much attention in recent years. In this paper we investigate some of the factors that contribute to the popularity of specific phrases such as Twitter hashtags. We define a new prediction task and propose a linguistically motivated algorithm for accurate prediction of hashtag popularity. Our prediction algorithm successfully models the ...
Modeling the Infectiousness of Twitter Hashtags
This study applies dynamical and statistical modeling techniques to quantify the proliferation and popularity of trending hashtags on Twitter. Using timeseries data reflecting actual tweets in New York City and San Francisco, we present estimates for the dynamics (i.e., rates of infection and recovery) of several hundred trending hashtags using an epidemic modeling framework coupled with Bayesi...
Journal:
- JASIST
Volume 64
Publication date: 2013